Tests of homoscedasticity, normality, and missing completely at random for incomplete multivariate data.
Abstract
Testing the homogeneity of covariances (homoscedasticity) among several groups has many applications in statistical analysis. In the context of incomplete data analysis, tests of homoscedasticity among groups of cases with identical missing data patterns have been proposed to test whether data are missing completely at random (MCAR). These tests of MCAR require large sample sizes n and/or large group sample sizes n(i), and they usually fail when applied to non-normal data. Hawkins (1981) proposed a test of multivariate normality and homoscedasticity that is an exact test for complete data when the n(i) are small. This paper proposes a modification of this test for complete data to improve its performance, and extends its application to testing homoscedasticity and MCAR when data are multivariate normal and incomplete. Moreover, it is shown that the statistic used in the Hawkins test, in conjunction with a nonparametric k-sample test, can be used to obtain a nonparametric test of homoscedasticity that works well for both normal and non-normal data. It is explained how a combination of the proposed normal-theory Hawkins test and the nonparametric test can be employed to test for homoscedasticity, MCAR, and multivariate normality. Simulation studies show that the newly proposed tests generally outperform their existing competitors in terms of Type I error rejection rates. A power study also indicates that the proposed tests have good power. The proposed methods use appropriate imputation to fill in missing data. Methods of multiple imputation are described, and one of them is employed to confirm the result of our single imputation methods. Examples are provided where multiple imputation enables one to identify a group or groups whose covariance matrices differ from the majority of other groups.
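As an illustration of the grouping step mentioned in the abstract (cases with identical missing data patterns form the groups that are compared), the following is a minimal Python sketch. The function name group_by_missing_pattern and the toy data are hypothetical and do not come from the paper; the sketch only forms the groups and does not implement the Hawkins test, the nonparametric test, or any imputation.

```python
import math
from collections import defaultdict


def group_by_missing_pattern(rows):
    """Map each missingness pattern to the indices of the rows sharing it.

    A pattern is a tuple of booleans, True where a value is missing
    (represented here, by assumption, as None or NaN).
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        pattern = tuple(
            value is None or (isinstance(value, float) and math.isnan(value))
            for value in row
        )
        groups[pattern].append(index)
    return dict(groups)


# Hypothetical toy data: 5 cases, 3 variables.
data = [
    [1.0, 2.0, 3.0],
    [4.0, None, 6.0],
    [7.0, 8.0, 9.0],
    [1.5, float("nan"), 2.5],
    [None, 0.5, 1.0],
]

patterns = group_by_missing_pattern(data)
for pattern, indices in patterns.items():
    print(pattern, indices)
```

Each resulting group would then play the role of one of the groups whose covariance matrices are compared; groups that are too small for the chosen test would need separate handling.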
Similar resources
Testing equality of covariance matrices when data are incomplete
In the statistics literature, a number of procedures have been proposed for testing equality of several groups’ covariance matrices when data are complete, but this problem has not been considered for incomplete data in a general setting. This paper proposes statistical tests for equality of covariance matrices when data are missing. A Wald test (denoted by T1), a likelihood ratio test (LRT) (de...
A note on MAR, identifying restrictions, model comparison, and sensitivity analysis in pattern mixture models with and without covariates for incomplete data.
Pattern mixture modeling is a popular approach for handling incomplete longitudinal data. Such models are not identifiable by construction. Identifying restrictions is one approach to mixture model identification (Little, 1995, Journal of the American Statistical Association 90, 1112-1121; Little and Wang, 1996, Biometrics 52, 98-111; Thijs et al., 2002, Biostatistics 3, 245-265; Kenward, Molen...
On the Canonical-Based Goodness-of-fit Tests for Multivariate Skew-Normality
It is well-known that the skew-normal distribution can provide an alternative model to the normal distribution for analyzing asymmetric data. The aim of this paper is to propose two goodness-of-fit tests for assessing whether a sample comes from a multivariate skew-normal (MSN) distribution. We address the problem of multivariate skew-normality goodness-of-fit based on the empirical Laplace tra...
Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals
‘Classical regression analysis’ assumes the normality (N), homoscedasticity (H) and serial independence (I) of regression residuals. Violation of the normality assumption may lead the investigator to inaccurate inferential statements. Recently, tests for normality have been derived for the case of homoscedastic serially independent (HI) residuals [e.g., White and Macdonald (1980)]. Similarly, t...
Marginal Analysis of A Population-Based Genetic Association Study of Quantitative Traits with Incomplete Longitudinal Data
A common study to investigate gene-environment interaction is designed to be longitudinal and population-based. Data arising from longitudinal association studies often contain missing responses. Naive analysis without taking missingness into account may produce invalid inference, especially when the missing data mechanism depends on the response process. To address this issue in the ana...
Journal:
- Psychometrika
Volume 75, Issue 4
Pages: -
Publication date: 2010